Solving average cost Markov decision processes by means of a two-phase time aggregation algorithm

نویسندگان

  • Edilson Fernandes de Arruda
  • Marcelo D. Fragoso
چکیده

This paper introduces a two-phase approach to solve average cost Markov decision processes, which is based on state space embedding or time aggregation. In the first phase, time aggregation is applied for policy optimization in a prescribed subset of the state space, and a novel result is applied to expand the evaluation to the whole state space. This evaluation is then used in the second phase in a policy improvement step, and the two phases are then alternated until convergence is attained. Some numerical experiments illustrate the results. 2014 Elsevier B.V. All rights reserved.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A time aggregation approach to Markov decision processes

We propose a time aggregation approach for the solution of infinite horizon average cost Markov decision processes via policy iteration. In this approach, policy update is only carried out when the process visits a subset of the state space. As in state aggregation, this approach leads to a reduced state space, which can lead to a substantial reduction in computational requirements, especially ...

متن کامل

A Multi-Mode Resource-Constrained Optimization of Time-Cost Trade-off Problems in Project Scheduling Using a Genetic Algorithm

In this paper, we present a genetic algorithm (GA) for optimization of a multi-mode resource constrained time cost trade off (MRCTCT) problem. The proposed GA, each activity has several operational modes and each mode identifies a possible executive time and cost of the activity. Beyond earlier studies on time-cost trade-off problem, in MRCTCT problem, resource requirements of each execution mo...

متن کامل

Exact Decomposition Approaches for Markov Decision Processes: A Survey

As classical methods are intractable for solving Markov decision processes MDPs requiring a large state space, decomposition and aggregation techniques are very useful to cope with large problems. These techniques are in general a special case of the classic Divide-and-Conquer framework to split a large, unwieldy problem into smaller components and solving the parts in order to construct the gl...

متن کامل

Accelerated decomposition techniques for large discounted Markov decision processes

Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...

متن کامل

An Adaptive Approach to Increase Accuracy of Forward Algorithm for Solving Evaluation Problems on Unstable Statistical Data Set

Nowadays, Hidden Markov models are extensively utilized for modeling stochastic processes. These models help researchers establish and implement the desired theoretical foundations using Markov algorithms such as Forward one. however, Using Stability hypothesis and the mean statistic for determining the values of Markov functions on unstable statistical data set has led to a significant reducti...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • European Journal of Operational Research

دوره 240  شماره 

صفحات  -

تاریخ انتشار 2015